Abstract. We study Sobolev a priori estimates for the optimal transportation $T = \nabla\Phi$ between probability measures $\mu = e^{-V}dx$ and $\nu = e^{-W}dx$ on $\mathbb{R}^d$. Assuming uniform convexity of the potential $W$ we show that $\int \|D^2\Phi\|^2_{HS}\,d\mu$, where $\|\cdot\|_{HS}$ is the Hilbert-Schmidt norm, is controlled by the Fisher information of $\mu$. In addition, we prove a similar estimate for the $L^p(\mu)$-norms of $\|D^2\Phi\|$ and obtain some $L^p$-generalizations of the well-known Caffarelli contraction theorem. We establish a connection of our results with the Talagrand transportation inequality. We also prove a corresponding dimension-free version for the relative Fisher information with respect to a Gaussian measure.

Keywords: Monge-Kantorovich problem, Monge-Ampère equation, Sobolev a

1. Introduction

Let $\mu = e^{-V}dx$ and $\nu = e^{-W}dx$ be probability measures on $\mathbb{R}^d$ and let $T = \nabla\Phi$ be the optimal transportation mapping such that $\nu$ is the image of $\mu$ with respect to $T$: $\nu = \mu\circ T^{-1}$. In what follows we say for brevity that $T$ sends (pushes forward) $\mu$ onto $\nu$. The corresponding convex potential is denoted by $\Phi$. The reader is advised to consult [33] for an account of optimal transportation theory.

Assuming that $W$ is uniformly convex ($D^2W \ge K\cdot\mathrm{Id}$, $K > 0$) we prove that

(1) $I_\mu = \int |\nabla V|^2\,d\mu \ge K\int \|D^2\Phi\|^2_{HS}\,d\mu$.

More generally, we show that for every unit $e\in\mathbb{R}^d$ and $p \ge 1$

(2) $\frac{p+1}{2}\,\|V_e^2\|_{L^p(\mu)} \ge K\,\|\Phi_{ee}^2\|_{L^p(\mu)}$.



These results can be considered as (global, dimension-free) Sobolev a priori estimates for the following Monge-Ampère equation:

$e^{-V} = e^{-W(\nabla\Phi)}\det D^2\Phi$.

The regularity theory for the Monge-Ampère operator has quite a long history, and many famous scientists have contributed to this area. We advise the reader to consult [IB] (see also [2], [30], [14], [8], [23], [34]). In particular, some Sobolev a priori estimates for the optimal transportation have been obtained by L. Caffarelli in [B]. The most recent results in this direction are concerned with the Hölder regularity of optimal transportation maps on manifolds (see [32], [IS], [TU], [TO], [13] and the references therein).

The approach we use here is in a sense probabilistic. The estimates obtained in this paper are 1) dimension-free, 2) global, 3) obtainable in a constructive way by integration by parts and the above-tangential formalism. We refer to the works of N. Ivochkina (for instance, [17]) for some similar arguments. In spite of the large number of results, the only global dimension-free estimate known before was given by the Caffarelli contraction theorem [7]. According to this result every optimal transportation $T$ sending the standard Gaussian measure onto a log-concave measure $\nu$ with uniformly convex $W$ (i.e. $D^2W \ge K\cdot\mathrm{Id}$ with $K > 0$) is a $\frac{1}{\sqrt{K}}$-contraction (i.e. $\|T\|_{Lip} \le \frac{1}{\sqrt{K}}$).

This contraction theorem has become very popular among probabilists because it immediately gives very nice analytical consequences (for instance, the Bakry-Ledoux theorem, a probabilistic version of the Lévy-Gromov comparison theorem). Some recent generalizations can be found in [2T], [H], [33]. Other applications are log-Sobolev and isoperimetric inequalities. By a recent observation of E. Milman (see [27], [28]), even weaker $L^p$-estimates for $\|D^2\Phi\|$ imply results of this type if the image measure is log-concave.

We note (though it is not the aim of this paper) that in this way one can also establish some Sobolev estimates for the third-order derivatives. Our estimates rely on the following (formal) identity:

$\int V_{x_i}^2\,d\mu = \int \langle D^2\Phi\cdot D^2W(\nabla\Phi)\cdot D^2\Phi\, e_i, e_i\rangle\,d\mu + \int \|(D^2\Phi)^{-1/2} D^2\Phi_{x_i} (D^2\Phi)^{-1/2}\|^2_{HS}\,d\mu$.

In particular, if $\Phi$ is sufficiently smooth and $D^2W \ge K\cdot\mathrm{Id}$, $K > 0$, then this identity implies (1) and the following estimate for the third-order derivatives of $\Phi$:

$\int |\nabla V|^2\,d\mu \ge 2\sqrt{K}\int \Big[\sum_{i=1}^{d} \|D^2\Phi_{x_i}\|^2_{HS}\Big]^{1/2} d\mu$.



Another motivation for this study comes from probability theory. It is worth noting that (1) appears to be very similar to the well-known Talagrand inequality (see [31]), which is a classical representative of the so-called transportation inequalities (see surveys [23], [IS]), close relatives of various functional inequalities (concentration, Sobolev, isoperimetric, etc.). Let $\gamma$ be the standard Gaussian measure. Consider the optimal transportation $\nabla\Phi$ of $g\cdot\gamma$ onto $\gamma$. Then the following (Talagrand or transportation) inequality holds:

(3) $\mathrm{Ent}_\gamma g \ge \frac{1}{2} W_2^2(\gamma, g\cdot\gamma)$, where

$\mathrm{Ent}_\gamma g = \int g\log g\,d\gamma$, $\quad W_2(\gamma, g\cdot\gamma) = \Big(\int |x - \nabla\Phi(x)|^2\, g\,d\gamma\Big)^{1/2}$

are the relative entropy and the Kantorovich distance.
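
As a sanity check of (3), which is not taken from the paper, one can look at the one-dimensional case $g(x) = \exp(hx - h^2/2)$. Then $g\cdot\gamma$ is the Gaussian law $N(h,1)$ and the monotone (hence optimal) map onto $\gamma$ is the shift $\nabla\Phi(x) = x - h$; both facts are assumptions of this illustration. The following minimal Python sketch evaluates the two sides of (3) by a midpoint rule. The helper names and the grid parameters are hypothetical choices made for this example only.

```python
import math

def gauss_density(x):
    """Density of the standard Gaussian measure on the real line."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrate(f, a=-12.0, b=12.0, n=20000):
    """Midpoint rule on [a, b]; adequate for rapidly decaying integrands."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))

def talagrand_sides(h):
    """Return (Ent_gamma g, W_2^2(gamma, g*gamma)) for g = exp(h x - h^2 / 2)."""
    g = lambda x: math.exp(h * x - 0.5 * h * h)
    transport = lambda x: x - h  # assumed optimal map of g*gamma onto gamma
    entropy = integrate(lambda x: g(x) * math.log(g(x)) * gauss_density(x))
    w2_squared = integrate(
        lambda x: (x - transport(x)) ** 2 * g(x) * gauss_density(x)
    )
    return entropy, w2_squared

entropy, w2_squared = talagrand_sides(1.5)
print(entropy, 0.5 * w2_squared)
```

In this example both printed numbers agree up to quadrature error, so a Gaussian shift is a case of equality in (3).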

We recall that the Talagrand inequality follows from the so-called displacement convexity property of the entropy functional (see [1], [34]). Note in this respect that the energies (Fisher information etc.), unlike entropies, are NOT displacement convex. Nevertheless, in Section 4 we reveal a direct relation of (1) to (3). First we prove the inequality

(4) $\int (V(x+e) - V(x))\,d\mu \ge \frac{K}{2}\int |\nabla\Phi(x+e) - \nabla\Phi(x)|^2\,d\mu$,

where $e\in\mathbb{R}^d$. It turns out that (4) can be considered as a version of a generalized Talagrand-type inequality proved in [20]. Then we show that (1) follows from (4) under a natural limiting procedure.

In Section 5 we prove some dimension-free estimates of the type (1). For instance, if $\mu = g\cdot\gamma$ (with smooth $g$) and $\nu = \gamma$, then

$I_\gamma g = 2\,\mathrm{Ent}_\gamma g - 2\int \log{\det}_2(D^2\Phi - \mathrm{Id})\, g\,d\gamma + \int \|D^2\Phi - \mathrm{Id}\|^2_{HS}\, g\,d\gamma + \sum_{k=1}^{d}\int \mathrm{Tr}\big[(D^2\Phi)^{-1} D^2\Phi_{x_k}\big]^2 g\,d\gamma$,

where $I_\gamma g = \int \frac{|\nabla g|^2}{g}\,d\gamma$ (relative information) and ${\det}_2(D^2\Phi - \mathrm{Id}) = \det D^2\Phi\cdot\exp(d - \Delta\Phi)$ (the Fredholm-Carleman determinant of $D^2\Phi - \mathrm{Id}$).

We note that all the terms in the right-hand side are non-negative. In particular, this identity implies a stronger version of the log-Sobolev inequality and the following (essentially infinite-dimensional) analog of (1):

$I_\gamma g \ge \int \|D^2\Phi - \mathrm{Id}\|^2_{HS}\, g\,d\gamma$.

Note that the result stated in this form looks particularly relevant to the Talagrand inequality. See also Remark 5.3 below on uniqueness of the extremals for the classical log-Sobolev inequality. In addition, we prove some dimension-free results for general log-concave reference measures.

In Section 6 we prove several $L^p$-generalizations of the main result. We prove that for every fixed unit vector $e$ and $p \ge 1$ one has

$K\,\|\Phi_{ee}^2\|_{L^p(\mu)} \le \|(V_{ee})_+\|_{L^p(\mu)}$.

We emphasize that all these estimates can be obtained without any use of regularity theory. Instead we apply the change of variables formula from [26] and the above-tangential formalism. Note that the contraction theorem follows from these estimates: this is exactly the case $p = \infty$. In addition, in Section 7 we prove the following dimension-free estimate for the operator norm $\|D^2\Phi\|$:



$K\Big(\int \|D^2\Phi\|^{2p}\,d\mu\Big)^{1/p} \le \Big(\int \|(D^2V)_+\|^{p}\,d\mu\Big)^{1/p}$.



Finally, we note that some of our results hold not only for optimal transportation mappings. For instance, they can be established for the so-called triangular mappings (see [1], [H]). See Section 2 and the forthcoming paper [22].

The author thanks Luigi Ambrosio, Max-Konstantin von Renesse, Michel Ledoux, 
Emanuel Milman, and Frank Morgan for their interest and stimulating discussions. 
This work was partially done during the author's visit to the Technische Universität Berlin with the support of the German Academic Exchange Service (DAAD).



2. Heuristic proof 

In this section we give a formal computation of the main formula of our work. 
See Sections 3 and 4 for rigorous justifications. 

In what follows we denote by $I_\mu$ the Fisher information of $\mu$:

$I_\mu = \int |\nabla V|^2\,d\mu$

and by $\|A\|_{HS} = \sqrt{\mathrm{Tr}(A\cdot A^T)}$ the Hilbert-Schmidt norm of a matrix $A$. For the operator norm we use the standard notation $\|\cdot\|$. It will be assumed throughout that $I_\mu < \infty$ and that $\mu$ and $\nu$ have finite second moments. The last condition is automatically satisfied for $\nu$ if $D^2W \ge K\cdot\mathrm{Id}$, $K > 0$.
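
The two matrix norms just introduced are compared repeatedly below, so a small numerical illustration may help. The following Python sketch (an illustration added here, with hypothetical helper names, restricted to symmetric $2\times 2$ matrices) computes both norms and checks the elementary relation $\|A\| \le \|A\|_{HS} \le \sqrt{d}\,\|A\|$ with $d = 2$.

```python
import math

def hs_norm(a):
    """Hilbert-Schmidt norm sqrt(Tr(A A^T)): root of the sum of squared entries."""
    return math.sqrt(sum(entry * entry for row in a for entry in row))

def operator_norm_sym2(a):
    """Operator norm of a symmetric 2x2 matrix: the largest |eigenvalue|."""
    p, q, r = a[0][0], a[0][1], a[1][1]
    mean = 0.5 * (p + r)
    radius = math.sqrt((0.5 * (p - r)) ** 2 + q * q)
    return max(abs(mean + radius), abs(mean - radius))

a = [[2.0, 1.0], [1.0, 3.0]]
op, hs = operator_norm_sym2(a), hs_norm(a)
print(op, hs)
```

For this sample matrix the operator norm is the larger eigenvalue $(5 + \sqrt{5})/2$ and the Hilbert-Schmidt norm is $\sqrt{15}$.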

Let $T$ be a mapping sending $\mu$ onto $\nu$. We assume that the potentials $V$, $W$ are smooth and that $T:\mathbb{R}^d\to\mathbb{R}^d$ is a smooth diffeomorphism satisfying $\det DT > 0$. By the change of variables formula

$e^{-V} = e^{-W(T)}\det DT$.

Taking the logarithm we obtain

(5) $V = W(T) - \log\det DT$.

Choose a unit vector $e$ and differentiate (5) along $e$ twice. To this end we apply the following fundamental relation

$\partial_e \log\det DT = \mathrm{Tr}\big[DT_e\cdot(DT)^{-1}\big]$.

Differentiating once again and applying

$DT_e\cdot(DT)^{-1} + DT\cdot\big[(DT)^{-1}\big]_e = 0$

we get

$\partial_{ee}\log\det DT = \mathrm{Tr}\big[DT_{ee}\cdot(DT)^{-1}\big] - \mathrm{Tr}\big[DT_e\cdot(DT)^{-1}\big]^2$.
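
The two differentiation rules for $\log\det$ can be checked numerically. The sketch below is an illustration only: the one-parameter family `mat(t)` of $2\times 2$ matrices is a hypothetical stand-in for $DT$ along the direction $e$, and the derivatives of $\log\det$ are approximated by central finite differences.

```python
import math

def mat(t):
    """A smooth 2x2 matrix family with positive determinant near t = 0.3."""
    return [[2.0 + t, 0.5 * t * t], [math.sin(t), 3.0 + t * t]]

def d_mat(t):
    """Entrywise first derivative of mat."""
    return [[1.0, t], [math.cos(t), 2.0 * t]]

def d2_mat(t):
    """Entrywise second derivative of mat."""
    return [[0.0, 1.0], [-math.sin(t), 2.0]]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse(m):
    d = det(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def log_det(t):
    return math.log(det(mat(t)))

t, h = 0.3, 1e-4
inv = inverse(mat(t))
first = matmul(d_mat(t), inv)  # plays the role of DT_e (DT)^{-1}
first_formula = trace(first)
# Tr[DT_ee (DT)^{-1}] - Tr[(DT_e (DT)^{-1})^2]
second_formula = trace(matmul(d2_mat(t), inv)) - trace(matmul(first, first))
first_numeric = (log_det(t + h) - log_det(t - h)) / (2.0 * h)
second_numeric = (log_det(t + h) - 2.0 * log_det(t) + log_det(t - h)) / (h * h)
print(first_formula, first_numeric, second_formula, second_numeric)
```

Note that $\mathrm{Tr}[\,\cdot\,]^2$ in the formula above denotes the trace of the squared matrix, which is how the sketch evaluates it.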
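Coming back to (5) one gets

$V_e = \langle\nabla W(T), T_e\rangle - \mathrm{Tr}\big[DT_e\cdot(DT)^{-1}\big]$,

(6) $V_{ee} = \langle D^2W(T)\cdot T_e, T_e\rangle + \langle\nabla W(T), T_{ee}\rangle - \mathrm{Tr}\big[DT_{ee}\cdot(DT)^{-1}\big] + \mathrm{Tr}\big[DT_e\cdot(DT)^{-1}\big]^2$.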

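Let us integrate (6) over $\mu$. Clearly, $\int V_{ee}\,d\mu = \int V_e^2\,d\mu$. Let us show that after taking the integral the terms in the middle cancel each other. Indeed, let us denote $S = T^{-1}$. One has

(7) $\int \langle\nabla W(T), T_{ee}\rangle\,d\mu = \int \langle\nabla W, T_{ee}(S)\rangle\,d\nu = \int \mathrm{Tr}\,D[T_{ee}(S)]\,d\nu$

$= \int \mathrm{Tr}\big[DT_{ee}(S)\cdot DS\big]\,d\nu = \int \mathrm{Tr}\big[DT_{ee}(S)\cdot(DT)^{-1}(S)\big]\,d\nu$

(8) $= \int \mathrm{Tr}\big[DT_{ee}\cdot(DT)^{-1}\big]\,d\mu$.

Thus we get

$\int V_e^2\,d\mu = \int \langle D^2W(T)\cdot T_e, T_e\rangle\,d\mu + \int \mathrm{Tr}\big[DT_e\cdot(DT)^{-1}\big]^2\,d\mu$.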

We are interested in two particular cases.

1) Optimal transportation mappings.

Optimal transportation mappings have the form $T = \nabla\Phi$, where $\Phi$ is a convex function. In this case one has

(9) $\int V_{x_i}^2\,d\mu = \int \langle D^2\Phi\cdot D^2W(\nabla\Phi)\cdot D^2\Phi\cdot e_i, e_i\rangle\,d\mu + \int \mathrm{Tr}\big[(D^2\Phi)^{-1}D^2\Phi_{x_i}\big]^2\,d\mu$.

Note that the last integrand is non-negative and admits another representation

$\mathrm{Tr}\big[(D^2\Phi)^{-1}D^2\Phi_{x_i}\big]^2 = \|(D^2\Phi)^{-1/2}D^2\Phi_{x_i}(D^2\Phi)^{-1/2}\|^2_{HS}$.

Taking the sum over $i$ we get

(10) $I_\mu = \int \mathrm{Tr}\big[D^2\Phi\cdot D^2W(\nabla\Phi)\cdot D^2\Phi\big]\,d\mu + \sum_{i=1}^{d}\int \|(D^2\Phi)^{-1/2}D^2\Phi_{x_i}(D^2\Phi)^{-1/2}\|^2_{HS}\,d\mu$.



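2) Triangular mappings.

Mappings of this type have the form

$T = (T_1(x_1), T_2(x_1,x_2), \ldots, T_d(x_1,\ldots,x_d))$,

where every $T_i$ is increasing in $x_i$. It is easy to check that in this case

(11) $\int V_{x_i}^2\,d\mu = \int \langle D^2W(T)\cdot\partial_{x_i}T, \partial_{x_i}T\rangle\,d\mu + \sum_{k=1}^{d}\int \big(\partial_{x_i}\ln\partial_{x_k}T_k\big)^2\,d\mu$,

(12) $I_\mu = \int \mathrm{Tr}\big[DT\cdot D^2W(T)\cdot(DT)^*\big]\,d\mu + \sum_{k=1}^{d}\int |\nabla\ln\partial_{x_k}T_k|^2\,d\mu$.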



3. Main result

Recall that a function $W$ is called uniformly convex (uniformly $K$-convex) if

(13) $x \mapsto W(x) - \frac{K}{2}|x|^2$

is a convex function for some $K > 0$. For a smooth $W$ this is equivalent to the condition $D^2W \ge K\cdot\mathrm{Id}$. Everywhere in this paper we deal with the case $K > 0$ only.

One can introduce in the standard way the weighted Sobolev spaces $W^{2,p}(\mu)$. We say that $f\in L^2(\mu)$ admits a distributional derivative $f_{x_i}\in L^1(\mu)$ if

$\int f_{x_i}\,\xi\,d\mu = -\int f\,\xi_{x_i}\,d\mu + \int f\,V_{x_i}\,\xi\,d\mu$

for every test function $\xi$. Similarly one can define $W_0^{2,p}(\mu)$ as the completion of the test functions in the corresponding Sobolev norm. It is known that $W^{2,p}(\mu) = W_0^{2,p}(\mu)$ if $I_\mu < \infty$ (see Theorem 5.1 in [IT]).

We denote by $f_+$ the function $\max\{f, 0\}$ and by $A_+$ the positive part of a symmetric matrix $A$ (or the zero matrix if $A \le 0$).

Theorem 3.1. Assume that $I_\mu < \infty$, $\mu$ has a finite second moment, and $W$ satisfies (13) for some $K > 0$. Then $\Phi\in W^{2,2}(\mu)$ and

(14) $I_\mu \ge K\int \|D^2\Phi\|^2_{HS}\,d\mu$.



Proof. Step 1 ($V$ and $W$ are smooth). Assume, in addition, that $V$ and $W$ satisfy the following assumptions:

1) $V, W\in C^\infty(\mathbb{R}^d)$ and are bounded from below,

2) $D^2V \le c\cdot\mathrm{Id}$ for some $c\in\mathbb{R}$.

By Caffarelli's regularity results (see, for instance, Theorem 4.14 of [34] and some justification in [3T], Section 4) $\Phi$ is smooth. Moreover, it follows by Caffarelli-type arguments from 2) and the uniform convexity of $W$ that

$\sup_{x\in\mathbb{R}^d}\|D^2\Phi(x)\| \le C$

for some $C$ (see, for instance, Theorem 2.2 in [21] and an independent proof in Section 6 below).

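Let us show that (9) holds. We take a smooth compactly supported test function $\xi$. Multiply (6) by $\xi$ and integrate over $\mu$. Apply the integration-by-parts formula (see (7)). One obtains

(15) $\int V_{x_i}^2\,\xi\,d\mu = \int \langle D^2\Phi\cdot D^2W(\nabla\Phi)\cdot D^2\Phi\cdot e_i, e_i\rangle\,\xi\,d\mu + \int \|(D^2\Phi)^{-1/2}D^2\Phi_{x_i}(D^2\Phi)^{-1/2}\|^2_{HS}\,\xi\,d\mu$

$\quad + \int \partial_{e_i}\xi\cdot V_{x_i}\,d\mu + \int \langle\nabla\xi, (D^2\Phi)^{-1}D^2\Phi_{x_i}\cdot e_i\rangle\,d\mu$.

Assume that $\xi$ has the form $\xi = \eta(\nabla\Phi)$, where $\eta$ is a test function. One has $\nabla\xi = D^2\Phi\cdot\nabla\eta(\nabla\Phi)$. Using the uniform estimate of $\|D^2\Phi\|$ one obtains

$\Big|\int \partial_{e_i}\xi\cdot V_{x_i}\,d\mu\Big| \le C\int |\nabla\eta(\nabla\Phi)|\,|V_{x_i}|\,d\mu \le C\, I_\mu^{1/2}\Big(\int |\nabla\eta|^2\,d\nu\Big)^{1/2}$.

To estimate the last term we integrate by parts:

$\int \langle\nabla\xi, (D^2\Phi)^{-1}D^2\Phi_{x_i}\cdot e_i\rangle\,d\mu = \int \langle\nabla\eta(\nabla\Phi), D^2\Phi_{x_i}\cdot e_i\rangle\,d\mu$

$= -\int \langle D^2\eta(\nabla\Phi)\,D^2\Phi\cdot e_i, D^2\Phi\cdot e_i\rangle\,d\mu + \int \langle\nabla\eta(\nabla\Phi), D^2\Phi\cdot e_i\rangle\,V_{x_i}\,d\mu$.

The latter does not exceed

$C^2\int \|D^2\eta(\nabla\Phi)\|\,d\mu + C\int |\nabla\eta(\nabla\Phi)|\,|V_{x_i}|\,d\mu \le C^2\int \|D^2\eta\|\,d\nu + C\, I_\mu^{1/2}\Big(\int |\nabla\eta|^2\,d\nu\Big)^{1/2}$.

Choosing a sequence of test functions $\{\eta_n\}$ such that $0 \le \eta_n \le 1$, $\eta_n \to 1$ uniformly on every compact set, and $|\nabla\eta_n|^2 \to 0$, $\|D^2\eta_n\| \to 0$ in $L^1(\nu)$, we get (9) (hence (14)) for $V$, $W$ satisfying 1)-2).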

Step 2 ($W$ is smooth). Fix a smooth uniformly $K$-convex function $W$ and approximate $\mu$ by smooth measures. We choose a sequence of functions $\{V_n\}$ such that every $V_n$ satisfies 1)-2). In addition, we assume that $\sqrt{\rho_n}\to\sqrt{\rho}$ in $W^{1,2}(\mathbb{R}^d)$ (here $\rho = e^{-V}$ is the density of $\mu$), every $\mu_n = \rho_n\,dx = e^{-V_n}dx$ is a probability measure, and $\sup_n\int |x|^2\,d\mu_n < \infty$.

Note that there exists a subsequence of $\{\nabla\Phi_n\}$ (denoted again by $\{\nabla\Phi_n\}$) such that $\nabla\Phi_n\to\nabla\Phi$ almost everywhere. Indeed, let $\Psi_n$ be the convex conjugate of $\Phi_n$. Recall that $\nabla\Phi_n$ and $\nabla\Psi_n$ are reciprocal. One has $\sup_n\int |\nabla\Psi_n|^2\,d\nu = \sup_n\int |x|^2\,d\mu_n < \infty$. We also require without loss of generality that $\int\Psi_n\,d\nu = 0$ (note that $\Psi_n\in L^2(\nu)$ by the Poincaré inequality for uniformly log-concave measures: $K\int (f - \int f\,d\nu)^2\,d\nu \le \int |\nabla f|^2\,d\nu$). Since $W$ is smooth, $\sup_n\int_{B_r}|\nabla\Psi_n|^2\,dx < \infty$ for every ball $B_r$. Using compactness of Sobolev embeddings one can easily show that there exists an a.e. convergent subsequence (denoted again by $\{\Psi_n\}$) $\Psi_n\to\Psi$. Since the functions $\Psi_n$ are convex, one also has $\nabla\Psi_n\to\nabla\Psi$ a.e. This implies a.e. convergence of the convex conjugate potentials $\Phi_n\to\Phi$ and their gradients $\nabla\Phi_n\to\nabla\Phi$.
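Moreover, since

$\int \|\nabla\Phi_n\|^2\,d\mu_n = \int \|x\|^2\,d\nu = \int \|\nabla\Phi\|^2\,d\mu$,

one has $\nabla\Phi_n\cdot\sqrt{\rho_n}\to\nabla\Phi\cdot\sqrt{\rho}$ strongly in $L^2(\mathbb{R}^d)$. In the same way one can check that (again up to a subsequence) $\partial_{x_ix_j}\Phi_n\sqrt{\rho_n}$ converges weakly in $L^2(\mathbb{R}^d)$ to some function $F$. This implies

$\int \xi\cdot\partial_{x_ix_j}\Phi_n\,\rho_n\,dx \to \int \xi\cdot F\sqrt{\rho}\,dx$

for every test function $\xi$. On the other hand

$\int \xi\cdot\partial_{x_ix_j}\Phi_n\,\rho_n\,dx = -\int \xi_{x_j}\cdot\partial_{x_i}\Phi_n\,\rho_n\,dx - \int \xi\cdot\partial_{x_i}\Phi_n\sqrt{\rho_n}\,\frac{\partial_{x_j}\rho_n}{\sqrt{\rho_n}}\,dx$.

By the strong convergence $\nabla\Phi_n\sqrt{\rho_n}\to\nabla\Phi\sqrt{\rho}$ the latter tends to

$-\int \xi_{x_j}\cdot\partial_{x_i}\Phi\,\rho\,dx - \int \xi\cdot\partial_{x_i}\Phi\,\partial_{x_j}\rho\,dx$.

The relation

$\int \xi\cdot F\sqrt{\rho}\,dx = -\int \xi_{x_j}\cdot\partial_{x_i}\Phi\,\rho\,dx - \int \xi\cdot\partial_{x_i}\Phi\,\partial_{x_j}\rho\,dx$

implies that the second distributional derivative $\partial_{x_ix_j}\Phi$ equals $F/\sqrt{\rho}$. Hence $D^2\Phi_n\cdot\sqrt{\rho_n}\to D^2\Phi\cdot\sqrt{\rho}$ weakly in $L^2(\mathbb{R}^d)$. Since the statement holds for the approximating sequence (according to Step 1), by the standard property of the weak convergence

$I_\mu = \lim_n I_{\mu_n} \ge K\liminf_n\int \|D^2\Phi_n\|^2_{HS}\,d\mu_n \ge K\int \|D^2\Phi\|^2_{HS}\,d\mu$.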

Step 3. At the final step we fix $\mu$ and approximate $e^{-W}$ by smooth uniformly log-concave probability densities $e^{-W_n}$ such that $\int |x|^2\,d\nu_n\to\int |x|^2\,d\nu$ and (13) holds for every $W_n$. The proof follows the arguments of Step 2. It is even easier because one has to deal with the fixed reference measure $\mu$. One obtains that $\nabla\Phi_n\to\nabla\Phi$ strongly in $L^2(\mu)$ and $D^2\Phi_n\to D^2\Phi$ weakly in $L^2(\mu)$. The result follows from the standard properties of the weak convergence. □
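
A simple one-dimensional illustration of (14), added here as a check and not taken from the paper: let $\mu = N(0,s^2)$ and $\nu = N(0,\sigma^2)$. Then $V(x) = x^2/(2s^2) + \mathrm{const}$, $W(x) = x^2/(2\sigma^2) + \mathrm{const}$, $K = 1/\sigma^2$, and the monotone map $T(x) = (\sigma/s)\,x$ pushes $\mu$ onto $\nu$, so $\Phi'' = \sigma/s$. The Python sketch below (hypothetical helper names, midpoint quadrature) evaluates both sides of (14) under these assumptions.

```python
import math

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))

def both_sides(s, sigma):
    """Return (I_mu, K * int |Phi''|^2 dmu) for mu = N(0, s^2), nu = N(0, sigma^2)."""
    density = lambda x: math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
    grad_v = lambda x: x / (s * s)   # V' for V(x) = x^2 / (2 s^2) + const
    k = 1.0 / (sigma * sigma)        # W'' = K for W(x) = x^2 / (2 sigma^2) + const
    hessian_phi = sigma / s          # T(x) = (sigma / s) x, so Phi'' is constant
    lim = 12.0 * s
    fisher = integrate(lambda x: grad_v(x) ** 2 * density(x), -lim, lim)
    rhs = k * integrate(lambda x: hessian_phi ** 2 * density(x), -lim, lim)
    return fisher, rhs

fisher, rhs = both_sides(2.0, 0.5)
print(fisher, rhs)
```

Both sides equal $1/s^2$ here, so one-dimensional Gaussian pairs give equality in (14); this is consistent with (10), where the third-order term vanishes because $\Phi''$ is constant.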

Remark 3.2. Third-order derivatives. Note that some global bounds on the third derivatives of $\Phi$ are also available. Indeed, if $\Phi$ is sufficiently smooth and (9) holds, then

$\int V_{x_i}^2\,d\mu \ge K\int \|D^2\Phi\cdot e_i\|^2\,d\mu + \int \|(D^2\Phi)^{-1/2}D^2\Phi_{x_i}(D^2\Phi)^{-1/2}\|^2_{HS}\,d\mu$,

where $\|\cdot\|$ is the standard operator norm. Summing over $i$, bounding the operator norm by the Hilbert-Schmidt norm, and applying the Cauchy inequality, one obtains

$\int |\nabla V|^2\,d\mu \ge 2\sqrt{K}\int \Big[\sum_{i=1}^{d}\|D^2\Phi_{x_i}\|^2_{HS}\Big]^{1/2} d\mu$.

4. Transportation inequalities 

In this section we show that inequality (1) follows from a (generalized) Talagrand inequality.

The following generalization of the Talagrand inequality has been proved in [20]. Let $f\cdot\nu$, $g\cdot\nu$ be probability measures, $\nu = e^{-W}dx$ with $D^2W \ge K\cdot\mathrm{Id}$, $K > 0$. Let $T_f$ ($T_g$) be the optimal transportation mapping pushing forward $f\cdot\nu$ ($g\cdot\nu$) onto $\nu$. Then the following inequality holds:

(16) $\int f\log\frac{f}{g}\,d\nu \ge \frac{K}{2}\int |T_f - T_g|^2 f\,d\nu$.

Remark 4.1. The Talagrand inequality in its classical form

(17) $\int \rho\log\rho\,d\nu \ge \frac{K}{2}\int |T(x) - x|^2\rho\,d\nu$

holds for any reasonable transportation mapping $T$ sending $\rho\cdot\nu$ onto $\nu$ and satisfying

(18) $\mathrm{div}(T^{-1}) - d - \log\det D(T^{-1}) \ge 0$

(this can be checked by the standard transportational arguments, see, for instance, [24]). Then (16) follows from (17) if we set

$\rho = \frac{f}{g}\circ(T_g^{-1})$, $\quad T = T_f\circ T_g^{-1}$.

Note that (18) holds for $T$ because $D(T^{-1})$ is a composition of two non-negative matrices (see the arguments below in the proof of Theorem 4.3).

Let us apply (16) to $f(x) = e^{-V(x)+W(x)}$ and $g(x) = e^{-V(x+e)+W(x)}$ ($e$ is a fixed vector). Clearly, $T_f = \nabla\Phi$ is the optimal transportation between $\mu$ and $\nu$ and $T_g = \nabla\Phi(x+e)$. We obtain

$\int (V(x+e) - V(x))\,d\mu \ge \frac{K}{2}\int |\nabla\Phi(x+e) - \nabla\Phi(x)|^2\,d\mu$.

In order to make the paper self-contained, we give below an independent proof of this result. Then we deduce from it the main result of the paper (inequality (1)).

Recall that every convex function $\varphi$ admits a.e. the so-called Alexandrov second-order derivative $D^2_a\varphi$, which is the absolutely continuous part of its distributional derivative $D^2\varphi$.

The following lemma holds trivially for smooth mappings and can be easily checked by approximation arguments.

Lemma 4.2. Let $\varphi: A\to\mathbb{R}$, $\psi: B\to\mathbb{R}$ be convex functions on convex sets $A$, $B$. Assume that $\nabla\psi(B)\subset A$. Then

$\mathrm{div}(\nabla\varphi\circ\nabla\psi) \ge \mathrm{Tr}\big[D^2_a\varphi(\nabla\psi)\cdot D^2_a\psi\big]\,dx \ge 0$,

where div is the distributional derivative.

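Theorem 4.3. Assume that $W$ is $K$-uniformly convex. Then for every $e\in\mathbb{R}^d$

$\int (V(x+e) - V(x))\,d\mu \ge \frac{K}{2}\int |\nabla\Phi(x+e) - \nabla\Phi(x)|^2\,d\mu$.

Proof. By a result of R.J. McCann on the change of variables formula (see [26] or [33])

$e^{-V} = {\det}_a D^2\Phi\cdot e^{-W(\nabla\Phi)}$

$\mu$-a.e. Hence $V = W(\nabla\Phi) - \log{\det}_a D^2\Phi$ and

$V(x+e) - V(x) = W(\nabla\Phi(x+e)) - W(\nabla\Phi(x)) - \log\big[({\det}_a D^2\Phi(x))^{-1}\cdot{\det}_a D^2\Phi(x+e)\big]$.

By the $K$-uniform convexity of $W$

$W(\nabla\Phi(x+e)) - W(\nabla\Phi(x)) \ge \langle\nabla\Phi(x+e) - \nabla\Phi(x), \nabla W(\nabla\Phi(x))\rangle + \frac{K}{2}|\nabla\Phi(x+e) - \nabla\Phi(x)|^2$.

This implies

$\int (V(x+e) - V(x))\,d\mu \ge \int \frac{K}{2}|\nabla\Phi(x+e) - \nabla\Phi(x)|^2\,d\mu + \int \langle\nabla\Phi(x+e) - \nabla\Phi(x), \nabla W(\nabla\Phi(x))\rangle\,d\mu$

$\quad - \int \log\big[({\det}_a D^2\Phi(x))^{-1}\cdot{\det}_a D^2\Phi(x+e)\big]\,d\mu$.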

Denote by $\Psi = \Phi^*$ the convex conjugate of $\Phi$. Using the fact that $\nabla\Psi$ and $\nabla\Phi$ are reciprocal we get

$\int \langle\nabla\Phi(x+e) - \nabla\Phi(x), \nabla W(\nabla\Phi(x))\rangle\,d\mu = \int \langle\nabla\Phi(\nabla\Psi(x)+e) - x, \nabla W(x)\rangle\,d\nu = \int \mathrm{div}\big(\nabla\Phi(\nabla\Psi(x)+e) - x\big)\,e^{-W}$,

where $\mathrm{div}(\nabla\Phi(\nabla\Psi(x)+e) - x)$ is the distributional derivative of the vector field $\nabla\Phi(\nabla\Psi(x)+e) - x$.

By Lemma 4.2 and the relation $(D^2_a\Phi(\nabla\Psi))^{-1} = D^2_a\Psi$, which holds $\nu$-a.e. (see [26] or [33]), we get

$\int \mathrm{div}\big(\nabla\Phi(\nabla\Psi(x)+e) - x\big)\,e^{-W} \ge \int \big(\mathrm{Tr}\, D^2_a\Phi(\nabla\Psi(x)+e)\cdot D^2_a\Psi(x) - d\big)\,d\nu = \int \big(\mathrm{Tr}\, D^2_a\Phi(x+e)\cdot(D^2_a\Phi(x))^{-1} - d\big)\,d\mu$.

It remains to note that

$\mathrm{Tr}\, D^2_a\Phi(x+e)\cdot(D^2_a\Phi(x))^{-1} - d - \log\big[({\det}_a D^2\Phi(x))^{-1}\cdot{\det}_a D^2\Phi(x+e)\big] \ge 0$.

Indeed, if $A$ and $B$ are symmetric and non-negative, then

$\mathrm{Tr}\,AB - d - \log\det AB = \mathrm{Tr}\,C - d - \log\det C$,

where $C = B^{1/2}AB^{1/2}$ is a symmetric non-negative matrix. It is well-known that $\mathrm{Tr}\,C - d - \log\det C \ge 0$. Indeed, the latter is equal to $\sum_i (c_i - 1 - \log c_i) \ge 0$, where $c_i$ are the eigenvalues of $C$. The proof is complete. □
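
The elementary inequality $\mathrm{Tr}\,C - d - \log\det C \ge 0$ used at the end of the proof is easy to test numerically. The sketch below is an illustration with hypothetical helper names, restricted to symmetric positive definite $2\times 2$ matrices; it evaluates the expression directly and through the eigenvalue sum $\sum_i (c_i - 1 - \log c_i)$.

```python
import math

def eigenvalues_sym2(c):
    """Eigenvalues of a symmetric 2x2 matrix, in increasing order."""
    p, q, r = c[0][0], c[0][1], c[1][1]
    mean = 0.5 * (p + r)
    radius = math.sqrt((0.5 * (p - r)) ** 2 + q * q)
    return mean - radius, mean + radius

def trace_gap(c):
    """Tr C - d - log det C for a symmetric positive definite 2x2 matrix C."""
    d = 2
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return c[0][0] + c[1][1] - d - math.log(det)

def eigen_gap(c):
    """The same quantity written as a sum over the eigenvalues of C."""
    return sum(ci - 1.0 - math.log(ci) for ci in eigenvalues_sym2(c))

samples = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity: the gap vanishes
    [[2.0, 0.5], [0.5, 1.0]],
    [[0.3, 0.1], [0.1, 4.0]],
]
gaps = [trace_gap(c) for c in samples]
print(gaps)
```

The gap vanishes only for $C = \mathrm{Id}$, since $c - 1 - \log c > 0$ for every $c > 0$, $c \ne 1$.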

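Proposition 4.4. Inequality (4) implies (1).

Proof. Following the arguments of Theorem 3.1 we see that it is sufficient to establish the implication (4) $\Rightarrow$ (1) for a nice potential $V$. By Theorem 4.3

$\int \frac{V(x+te) + V(x-te) - 2V(x)}{t^2}\,d\mu \ge \frac{K}{2t^2}\int \big(\|\nabla\Phi(x+te) - \nabla\Phi(x)\|^2 + \|\nabla\Phi(x-te) - \nabla\Phi(x)\|^2\big)\,d\mu$.

Thus, without loss of generality we may assume that $V$ satisfies

$\frac{\exp(V(x) - V(x+te)) - 1}{t} \to -V_e$ in $L^2(\mu)$, $t\to 0$,

and

$\lim_{t\to 0}\frac{1}{t^2}\int (V(x+te) + V(x-te) - 2V(x))\,d\mu = \int V_{ee}\,d\mu$

for every $e$. Extract $L^2(\mu)$-weakly convergent subsequences $\Big\{\frac{\nabla\Phi(x\pm t_ne) - \nabla\Phi(x)}{t_n}\Big\}$ (we keep the same index $n$). Note that for every test function $\xi$

$\int \frac{\nabla\Phi(x+t_ne) - \nabla\Phi(x)}{t_n}\,\xi\,d\mu = \int \nabla\Phi(x)\,\frac{\xi(x-t_ne) - \xi(x)}{t_n}\,d\mu + \int \nabla\Phi(x)\,\xi(x-t_ne)\,\frac{\exp(V(x) - V(x-t_ne)) - 1}{t_n}\,d\mu$.

Obviously, the latter tends to

$-\int \nabla\Phi(x)\,\xi_e\,d\mu + \int \nabla\Phi(x)\,\xi\,V_e\,d\mu$.

Hence

$\frac{\nabla\Phi(x\pm t_ne) - \nabla\Phi(x)}{t_n} \to \pm\nabla\Phi_e$

weakly in $L^2(\mu)$. By the properties of the weak convergence

$\int V_e^2\,d\mu = \int V_{ee}\,d\mu \ge K\int \|\nabla\Phi_e\|^2\,d\mu$.

Applying this to every $e_i$ and taking the sum we complete the proof. □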

5. Dimension-free inequalities 

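In this section we prove some essentially infinite-dimensional estimates (which do not contain dimension-dependent constants and make sense in the infinite-dimensional case). The results below also hold (with certain modifications) for triangular mappings.

5.1. Gaussian case. We denote by $\gamma$ the standard Gaussian measure on $\mathbb{R}^d$. Let $\mu = g\cdot\gamma$, $\nu = \gamma$ and let $\nabla\Phi$ be the corresponding optimal transport. According to the result from Section 3

$\int \Big|\frac{\nabla g}{g} - x\Big|^2 g\,d\gamma = \int \|D^2\Phi\|^2_{HS}\, g\,d\gamma + \sum_{k=1}^{d}\int \mathrm{Tr}\big[(D^2\Phi)^{-1}D^2\Phi_{x_k}\big]^2 g\,d\gamma$.

Note that

$\int \Big|\frac{\nabla g}{g} - x\Big|^2 g\,d\gamma = \int \frac{|\nabla g|^2}{g}\,d\gamma - 2\int \langle\nabla g, x\rangle\,d\gamma + \int |x|^2 g\,d\gamma$

and

$\int \|D^2\Phi\|^2_{HS}\, g\,d\gamma = \int \|D^2\Phi - \mathrm{Id}\|^2_{HS}\, g\,d\gamma + 2\int \Delta\Phi\, g\,d\gamma - d$.

Apply integration by parts:

$-2\int \langle\nabla g, x\rangle\,d\gamma + \int |x|^2 g\,d\gamma = 2d - \int |x|^2 g\,d\gamma$.

By the change of variables formula

$2\int \Delta\Phi\, g\,d\gamma - d = 2\int \Delta\Phi\, g\,d\gamma - \int |\nabla\Phi|^2 g\,d\gamma$.

Consequently

$\int \frac{|\nabla g|^2}{g}\,d\gamma = \int \|D^2\Phi - \mathrm{Id}\|^2_{HS}\, g\,d\gamma + \int (|x|^2 - |\nabla\Phi|^2)\, g\,d\gamma + 2\int (\Delta\Phi - d)\, g\,d\gamma + \sum_{k=1}^{d}\int \mathrm{Tr}\big[(D^2\Phi)^{-1}D^2\Phi_{x_k}\big]^2 g\,d\gamma$.

Taking the logarithm of the change of variables formula we get

$\log g = \frac{|x|^2}{2} - \frac{|\nabla\Phi|^2}{2} + \log\det D^2\Phi$.

Applying this formula we get the heuristic proof of the following statement:

Every probability measure $g\cdot\gamma$ with smooth $g$ and smooth $\nabla\Phi$ satisfies the following relation

(19) $I_\gamma g = 2\,\mathrm{Ent}_\gamma g - 2\int \log{\det}_2(D^2\Phi - \mathrm{Id})\, g\,d\gamma + \int \|D^2\Phi - \mathrm{Id}\|^2_{HS}\, g\,d\gamma + \sum_{k=1}^{d}\int \mathrm{Tr}\big[(D^2\Phi)^{-1}D^2\Phi_{x_k}\big]^2 g\,d\gamma$,

where $I_\gamma g = \int \frac{|\nabla g|^2}{g}\,d\gamma$ (relative information), $\mathrm{Ent}_\gamma g = \int g\log g\,d\gamma$ (relative entropy), ${\det}_2(D^2\Phi - \mathrm{Id}) = \det D^2\Phi\cdot\exp(d - \Delta\Phi)$ (the Fredholm-Carleman determinant of $D^2\Phi - \mathrm{Id}$).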
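
Identity (19) can be verified by hand in a simple one-dimensional example, which is added here as an illustration and is not part of the paper. Take $\mu = g\cdot\gamma = N(0,s^2)$, so that $\log g(x) = -\log s + \frac{x^2}{2}(1 - s^{-2})$, the optimal map onto $\gamma$ is $\nabla\Phi(x) = x/s$, $\Phi'' = 1/s$ and $\Phi''' = 0$. The Python sketch below (hypothetical helper names) computes $I_\gamma g$ and $\mathrm{Ent}_\gamma g$ by quadrature against the density of $\mu$ and assembles the right-hand side of (19) with $d = 1$.

```python
import math

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))

def identity_sides(s):
    """Both sides of (19) for d = 1 and g*gamma = N(0, s^2) (an assumed example)."""
    mu_density = lambda x: math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
    log_g = lambda x: -math.log(s) + 0.5 * x * x * (1.0 - 1.0 / (s * s))
    d_log_g = lambda x: x * (1.0 - 1.0 / (s * s))
    phi_xx = 1.0 / s   # second derivative of Phi(x) = x^2 / (2 s)
    phi_xxx = 0.0      # third derivative vanishes
    lim = 12.0 * s
    # I_gamma g = int |g'|^2 / g dgamma = int |(log g)'|^2 dmu
    fisher = integrate(lambda x: d_log_g(x) ** 2 * mu_density(x), -lim, lim)
    # Ent_gamma g = int g log g dgamma = int log g dmu
    entropy = integrate(lambda x: log_g(x) * mu_density(x), -lim, lim)
    # log det_2(D^2 Phi - Id) = log det D^2 Phi + d - Laplacian of Phi, with d = 1
    log_det2 = math.log(phi_xx) + 1.0 - phi_xx
    rhs = (2.0 * entropy - 2.0 * log_det2
           + (phi_xx - 1.0) ** 2 + (phi_xxx / phi_xx) ** 2)
    return fisher, rhs

lhs, rhs = identity_sides(1.5)
print(lhs, rhs)
```

In closed form both sides equal $s^2 - 2 + s^{-2}$, and each of the four terms on the right-hand side is non-negative in this example.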

Remark 5.1. Since all the terms in the right-hand side are non-negative, this statement implies, in particular, the classical logarithmic Sobolev inequality

$I_\gamma g \ge 2\,\mathrm{Ent}_\gamma g$

and the Gaussian analog of (1)

(20) $I_\gamma g \ge \int \|D^2\Phi - \mathrm{Id}\|^2_{HS}\, g\,d\gamma$.

Remark 5.2. Identity (19) holds, for instance, under the following assumptions: $g$ is smooth, bounded, strictly positive, $I_\gamma g < \infty$, and $-D^2\log g \le c\cdot\mathrm{Id}$. See Step 1 in the proof of Theorem 3.1.

Inequality (20) follows immediately from Theorem 3.1 under the sole assumption $I_\gamma g < \infty$.

Remark 5.3. It was pointed out to the author by Michel Ledoux that (19) implies the description of the extremals for the classical log-Sobolev inequality. Indeed, equality in the log-Sobolev inequality is possible if and only if $D^2\Phi = \mathrm{Id}$, hence $\nabla\Phi$ is a shift and $g$ has the form $g = \exp(\langle h, x\rangle - \frac{1}{2}\|h\|^2)$, $h\in\mathbb{R}^d$. This result has been established by other methods in [9].

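5.2. Log-concave case. Below we deal with the case of the source measure $g\,e^{-W}dx$ and the target measure $\nu = e^{-W}dx$, where $W$ is convex. In the formulas of this subsection $\mu$ stands for the reference measure $e^{-W}dx$, so that the source measure is $g\cdot\mu$. By the above results

$\int \Big|\frac{\nabla g}{g} - \nabla W\Big|^2 g\,d\mu \ge \int \mathrm{Tr}\big[D^2\Phi\cdot D^2W(\nabla\Phi)\cdot D^2\Phi\big]\, g\,d\mu$.

Rewrite the left-hand side:

$\int \Big|\frac{\nabla g}{g} - \nabla W\Big|^2 g\,d\mu = \int \frac{|\nabla g|^2}{g}\,d\mu - 2\int \langle\nabla g, \nabla W\rangle\,d\mu + \int |\nabla W|^2 g\,d\mu$.

Rewrite the right-hand side:

$\int \mathrm{Tr}\big[D^2\Phi\cdot D^2W(\nabla\Phi)\cdot D^2\Phi\big]\, g\,d\mu = \int \mathrm{Tr}\big[(D^2\Phi - \mathrm{Id})\cdot D^2W(\nabla\Phi)\cdot(D^2\Phi - \mathrm{Id})\big]\, g\,d\mu + 2\int \mathrm{div}(\nabla W\circ\nabla\Phi)\, g\,d\mu - \int \Delta W(\nabla\Phi)\, g\,d\mu$

$= \int \mathrm{Tr}\big[(D^2\Phi - \mathrm{Id})\cdot D^2W(\nabla\Phi)\cdot(D^2\Phi - \mathrm{Id})\big]\, g\,d\mu - 2\int \langle\nabla g, \nabla W\circ\nabla\Phi\rangle\,d\mu + 2\int \langle\nabla W, \nabla W\circ\nabla\Phi\rangle\, g\,d\mu - \int \Delta W(\nabla\Phi)\, g\,d\mu$.

Consequently

$\int \frac{|\nabla g|^2}{g}\,d\mu + \int |\nabla W|^2 g\,d\mu \ge 2\int \langle\nabla g, \nabla W - \nabla W\circ\nabla\Phi\rangle\,d\mu + 2\int \langle\nabla W, \nabla W\circ\nabla\Phi\rangle\, g\,d\mu$

$\quad + \int \mathrm{Tr}\big[(D^2\Phi - \mathrm{Id})\cdot D^2W(\nabla\Phi)\cdot(D^2\Phi - \mathrm{Id})\big]\, g\,d\mu - \int \Delta W(\nabla\Phi)\, g\,d\mu$.

This implies

$\int \frac{|\nabla g|^2}{g}\,d\mu + \int |\nabla W|^2 g\,d\mu - 2\int \langle\nabla W, \nabla W\circ\nabla\Phi\rangle\, g\,d\mu + \int |\nabla W\circ\nabla\Phi|^2 g\,d\mu$

$\quad \ge 2\int \langle\nabla g, \nabla W - \nabla W\circ\nabla\Phi\rangle\,d\mu + \int \mathrm{Tr}\big[(D^2\Phi - \mathrm{Id})\cdot D^2W(\nabla\Phi)\cdot(D^2\Phi - \mathrm{Id})\big]\, g\,d\mu + \int \big[|\nabla W\circ\nabla\Phi|^2 - \Delta W(\nabla\Phi)\big]\, g\,d\mu$.

Taking into account that

$\int \big[|\nabla W\circ\nabla\Phi|^2 - \Delta W(\nabla\Phi)\big]\, g\,d\mu = \int \big[|\nabla W|^2 - \Delta W\big]\,d\mu = 0$

we get

(21) $\int \Big|\frac{\nabla g}{g} - (\nabla W - \nabla W\circ\nabla\Phi)\Big|^2 g\,d\mu \ge \int \mathrm{Tr}\big[(D^2\Phi - \mathrm{Id})\cdot D^2W(\nabla\Phi)\cdot(D^2\Phi - \mathrm{Id})\big]\, g\,d\mu$.

By the Cauchy inequality

(22) $2\int \frac{|\nabla g|^2}{g}\,d\mu + 2\int |\nabla W - \nabla W\circ\nabla\Phi|^2 g\,d\mu \ge \int \mathrm{Tr}\big[(D^2\Phi - \mathrm{Id})\cdot D^2W(\nabla\Phi)\cdot(D^2\Phi - \mathrm{Id})\big]\, g\,d\mu$.

Thus in order to estimate

$\int \mathrm{Tr}\big[(D^2\Phi - \mathrm{Id})\cdot D^2W(\nabla\Phi)\cdot(D^2\Phi - \mathrm{Id})\big]\, g\,d\mu$

(or $\int \|D^2\Phi - \mathrm{Id}\|^2_{HS}\, g\,d\mu$ for uniformly convex $W$) it is sufficient to get a bound for

$\int |\nabla W - \nabla W\circ\nabla\Phi|^2 g\,d\mu$.

Some estimates of quantities of this type are established in [5]. We give below the proof for the simplest case (the potential has a quadratic-like growth).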

Theorem 5.4. Assume that for some K > 



K, 



W(x) - (VW(y),x- y) - W(y) > f\VW(x) - VW{y)\ 2 



and D 2 W > K ■ Id. Then 



K 
~2 



J \\D 2 <S> - Id||^ s gdiX <^Jg\ogg dfi + J I^L dp. 



In particular, the estimate holds for some $K > 0$ if $C_1\cdot\mathrm{Id} \le D^2W \le C_2\cdot\mathrm{Id}$.

Proof. The result follows from Theorem 3.1, the above computations, and the estimate below. The proof can be easily reduced to the case of smooth $g$ and $T$ (see the proof of Theorem 3.1). By the change of variables formula for $T = (\nabla\Phi)^{-1} = \nabla\Phi^*$ one has

$\log g(T) - W(T) + \log\det DT = -W(x)$.

Rewrite it in the following way:

$\log g(T) = W(T) - \langle\nabla W(x), T(x) - x\rangle - W(x) + \big[\langle\nabla W(x), T(x) - x\rangle - \log\det DT\big]$.

Note that

$\int \big[\langle\nabla W(x), T(x) - x\rangle - \log\det DT\big]\,d\mu = \int \big[\mathrm{Tr}\,DT - d - \log\det DT\big]\,d\mu \ge 0$.

Hence

$\int \log g(T)\,d\mu \ge \int \big[W(T) - \langle\nabla W(x), T(x) - x\rangle - W(x)\big]\,d\mu$.

By the change of variables

$\int g\log g\,d\mu \ge \int \big[W(x) - \langle\nabla W(\nabla\Phi(x)), x - \nabla\Phi(x)\rangle - W(\nabla\Phi(x))\big]\, g(x)\,d\mu$.

This inequality, (22), and the assumptions of the theorem imply the result. □



6. $L^p$-estimates and Caffarelli's theorem

We generalize below the results of the previous sections and prove some corresponding $L^p$-estimates. As a particular case we get the contraction result of Caffarelli. Note that some dimension-free $L^p$-generalizations of the Talagrand transportation inequality have been obtained in [5]. In particular, it was shown in [5] that $\|\nabla\Phi\|_{L^{2p}(\mu)}$ is controlled by $\int g|\log g|^p\,d\mu$, $p \ge 1$, for any $\mu$ satisfying a log-Sobolev inequality.

The proof of the result below follows the arguments of Theorem 4.3. That is why we omit the details and just give a short outline of the proof.

Theorem 6.1. Assume that $D^2W \ge K\cdot\mathrm{Id}$. Then for every unit $e$, $p \ge 0$, and $r = \frac{p+2}{2}$ one has

$K\,\|\Phi_{ee}^2\|_{L^r(\mu)} \le \|(V_{ee})_+\|_{L^r(\mu)}$,

$K\,\|\Phi_{ee}^2\|_{L^r(\mu)} \le \frac{p+4}{4}\,\|V_e^2\|_{L^r(\mu)}$.

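Proof. Fix a unit vector $e$ and apply the change of variables formula and the uniform convexity of $W$:

$V(x+te) - V(x) \ge \langle\nabla\Phi(x+te) - \nabla\Phi(x), \nabla W(\nabla\Phi(x))\rangle + \frac{K}{2}|\nabla\Phi(x+te) - \nabla\Phi(x)|^2 - \log\big[({\det}_a D^2\Phi(x))^{-1}\cdot{\det}_a D^2\Phi(x+te)\big]$.

Multiply this inequality by $(\delta_{te}\Phi)^p$, where $p \ge 0$ and

$\delta_{te}\Phi = \Phi(x+te) + \Phi(x-te) - 2\Phi(x)$,

and integrate over $\mu$ (note that $\delta_{te}\Phi \ge 0$ by convexity of $\Phi$). Integrating by parts we get

$\int \langle\nabla\Phi(x+te) - \nabla\Phi(x), \nabla W(\nabla\Phi(x))\rangle\,(\delta_{te}\Phi)^p\,d\mu = \int \langle\nabla\Phi(x+te)\circ(\nabla\Psi) - x, \nabla W(x)\rangle\,(\delta_{te}\Phi)^p\circ(\nabla\Psi)\,d\nu$

$\ge \int \big(\mathrm{Tr}\big[D^2_a\Phi(x+te)\cdot(D^2_a\Phi)^{-1}\big]\circ(\nabla\Psi) - d\big)\,(\delta_{te}\Phi)^p\circ(\nabla\Psi)\,d\nu$

$\quad + p\int \big\langle\nabla\Phi(x+te)\circ(\nabla\Psi) - x, (D^2\Psi)\,\nabla\delta_{te}\Phi\circ(\nabla\Psi)\big\rangle\,(\delta_{te}\Phi)^{p-1}\circ(\nabla\Psi)\,d\nu$.

Applying the inequality $\mathrm{Tr}\,A - d - \log\det A \ge 0$, which is valid for compositions of symmetric positive matrices, we get

$\int (V(x+te) - V(x))\,(\delta_{te}\Phi)^p\,d\mu \ge \frac{K}{2}\int |\nabla\Phi(x+te) - \nabla\Phi(x)|^2\,(\delta_{te}\Phi)^p\,d\mu$

$\quad + p\int \big\langle\nabla\Phi(x+te) - \nabla\Phi(x), (D^2\Psi)\circ\nabla\Phi(x)\,\nabla\delta_{te}\Phi\big\rangle\,(\delta_{te}\Phi)^{p-1}\,d\mu$.

Applying the same inequality to $-te$ and taking the sum we get

$\int (V(x+te) + V(x-te) - 2V(x))\,(\delta_{te}\Phi)^p\,d\mu$

$\ge \frac{K}{2}\int |\nabla\Phi(x+te) - \nabla\Phi(x)|^2\,(\delta_{te}\Phi)^p\,d\mu + \frac{K}{2}\int |\nabla\Phi(x-te) - \nabla\Phi(x)|^2\,(\delta_{te}\Phi)^p\,d\mu$

$\quad + p\int \langle\nabla\delta_{te}\Phi, (D^2\Phi)^{-1}\nabla\delta_{te}\Phi\rangle\,(\delta_{te}\Phi)^{p-1}\,d\mu$.

Note that the last term is non-negative. Dividing by $t^{2p+2}$ and passing to the limit we obtain

(23) $\int V_{ee}\,\Phi_{ee}^p\,d\mu \ge K\int \|D^2\Phi\cdot e\|^2\,\Phi_{ee}^p\,d\mu + p\int \langle(D^2\Phi)^{-1}\nabla\Phi_{ee}, \nabla\Phi_{ee}\rangle\,\Phi_{ee}^{p-1}\,d\mu$.

For the proof of the first part we note that

$\int V_{ee}\,\Phi_{ee}^p\,d\mu \ge K\int \Phi_{ee}^{p+2}\,d\mu$.

Applying the Hölder inequality one gets

$\|(V_{ee})_+\|_{L^{(p+2)/2}(\mu)}\,\|\Phi_{ee}^p\|_{L^{(p+2)/p}(\mu)} \ge \int V_{ee}\,\Phi_{ee}^p\,d\mu$.

This readily implies the result.

To prove the second part we integrate by parts the left-hand side:

$\int V_{ee}\,\Phi_{ee}^p\,d\mu = -p\int V_e\,\Phi_{eee}\,\Phi_{ee}^{p-1}\,d\mu + \int V_e^2\,\Phi_{ee}^p\,d\mu = -p\int \langle\nabla\Phi_{ee}, V_e\cdot e\rangle\,\Phi_{ee}^{p-1}\,d\mu + \int V_e^2\,\Phi_{ee}^p\,d\mu$.

By the Cauchy inequality the latter does not exceed

$p\int \langle(D^2\Phi)^{-1}\nabla\Phi_{ee}, \nabla\Phi_{ee}\rangle\,\Phi_{ee}^{p-1}\,d\mu + \frac{p}{4}\int V_e^2\,\langle(D^2\Phi)e, e\rangle\,\Phi_{ee}^{p-1}\,d\mu + \int V_e^2\,\Phi_{ee}^p\,d\mu$.

Inequality (23) implies

$\frac{p+4}{4}\int V_e^2\,\Phi_{ee}^p\,d\mu \ge K\int |\nabla\Phi_e|^2\,\Phi_{ee}^p\,d\mu \ge K\int \Phi_{ee}^{p+2}\,d\mu$.

The rest of the proof is the same as in the first part. □

Corollary 6.2. In the limit $p\to\infty$ we obtain the contraction theorem of Caffarelli:

$K\,\|\Phi_{ee}^2\|_{L^\infty(\mu)} \le \|(V_{ee})_+\|_{L^\infty(\mu)}$.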
7. Operator norm estimates 

This section gives a partial answer to a question posed to the author by Emanuel Milman: is it possible to estimate effectively (say, without dimension dependence) the operator norm of $D^2\Phi$? Estimates of this type would have interesting consequences for Sobolev-type inequalities for log-concave measures.

Since the operator norm is controlled by the Hilbert-Schmidt norm, the previous results trivially imply the following estimate:

$I_\mu \ge K\int \|D^2\Phi\|^2_{HS}\,d\mu \ge K\int \|D^2\Phi\|^2\,d\mu$.

We emphasize, however, that for many problems the assumption $I_\mu < \infty$ is too strong and leads to dimension-dependent results.

The main aim of this section is to show that for uniformly log-concave $\nu$

$\int \|(D^2V)_+\|\,d\mu \ge K\int \|D^2\Phi\|^2\,d\mu$.


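Lemma 7.1. Assume that $\Phi$ is smooth. Then for every smooth vector field $v$ and every nonnegative test function $\eta$ the following inequality holds:

$\int \langle D^2V\,v, v\rangle\,\eta\,d\mu \ge K\int \|D^2\Phi\cdot v\|^2\,\eta\,d\mu + \int \langle(D^2\Phi)_v\cdot v, (D^2\Phi)^{-1}\nabla\eta\rangle\,d\mu$

$\quad + 2\int \mathrm{Tr}\big((D^2\Phi)_v\cdot Dv\cdot(D^2\Phi)^{-1}\big)\,\eta\,d\mu + \int \mathrm{Tr}\big[(D^2\Phi)^{-1}(D^2\Phi)_v\big]^2\,\eta\,d\mu$.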

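Proof. It follows from the change of variables formula $V = W(\nabla\Phi) - \log\det D^2\Phi$ that

$V(x+tv) - V(x) = W(\nabla\Phi(x+tv)) - W(\nabla\Phi(x)) - \log\big[(\det D^2\Phi(x))^{-1}\cdot\det D^2\Phi(x+tv)\big]$.

By the $K$-uniform convexity of $W$

$W(\nabla\Phi(x+tv)) - W(\nabla\Phi(x)) \ge \langle\nabla\Phi(x+tv) - \nabla\Phi(x), \nabla W(\nabla\Phi(x))\rangle + \frac{K}{2}|\nabla\Phi(x+tv) - \nabla\Phi(x)|^2$.

This implies

$\int (V(x+tv) - V(x))\,\eta\,d\mu \ge \int \frac{K}{2}|\nabla\Phi(x+tv) - \nabla\Phi(x)|^2\,\eta\,d\mu + \int \langle\nabla\Phi(x+tv) - \nabla\Phi(x), \nabla W(\nabla\Phi(x))\rangle\,\eta\,d\mu$

$\quad - \int \log\big[(\det D^2\Phi(x))^{-1}\cdot\det D^2\Phi(x+tv)\big]\,\eta\,d\mu$.

Denote by $\Psi = \Phi^*$ the convex conjugate of $\Phi$. Using the fact that $\nabla\Psi$ and $\nabla\Phi$ are reciprocal we get

$\int \langle\nabla\Phi(x+tv) - \nabla\Phi(x), \nabla W(\nabla\Phi(x))\rangle\,\eta\,d\mu = \int \langle\nabla\Phi(x+tv)\circ(\nabla\Psi) - x, \nabla W(x)\rangle\,\eta(\nabla\Psi)\,d\nu$

$= \int \mathrm{div}\big(\nabla\Phi(x+tv)\circ(\nabla\Psi) - x\big)\,\eta(\nabla\Psi)\,e^{-W}dx + \int \langle\nabla\Phi(x+tv)\circ(\nabla\Psi) - x, D^2\Psi\cdot\nabla\eta(\nabla\Psi)\rangle\,d\nu$.

By the relation $(D^2\Phi(\nabla\Psi))^{-1} = D^2\Psi$ we get

$\int \mathrm{div}\big(\nabla\Phi(x+tv)\circ(\nabla\Psi) - x\big)\,\eta(\nabla\Psi)\,e^{-W}dx = \int \big(\mathrm{Tr}\,D^2\Phi(x+tv)\circ(\nabla\Psi)\cdot(I + tDv)\circ(\nabla\Psi)\cdot D^2\Psi - d\big)\,\eta(\nabla\Psi)\,d\nu$

$= \int \big(\mathrm{Tr}\,D^2\Phi(x+tv)\cdot(I + tDv)\cdot(D^2\Phi(x))^{-1} - d\big)\,\eta\,d\mu$.

Remark that

$T(x,tv) = \mathrm{Tr}\,D^2\Phi(x+tv)\cdot(D^2\Phi(x))^{-1} - d - \log\big[(\det D^2\Phi(x))^{-1}\cdot\det D^2\Phi(x+tv)\big] \ge 0$.

Thus one obtains

$\int (V(x+tv) - V(x))\,\eta\,d\mu \ge \frac{K}{2}\int |\nabla\Phi(x+tv) - \nabla\Phi(x)|^2\,\eta\,d\mu + t\int \mathrm{Tr}\big(D^2\Phi(x+tv)\cdot Dv(x)\cdot(D^2\Phi(x))^{-1}\big)\,\eta\,d\mu$

$\quad + \int \langle\nabla\Phi(x+tv) - \nabla\Phi(x), (D^2\Phi)^{-1}\nabla\eta\rangle\,d\mu + \int T(x,tv)\,\eta\,d\mu$.

Now apply the same inequality to $-tv$, take the sum, and divide by $t^2$. It can be easily verified with the help of the Taylor formula that

$\lim_{t\to 0}\frac{T(x,tv) + T(x,-tv)}{t^2} = \mathrm{Tr}\big[(D^2\Phi)^{-1}(D^2\Phi)_v\big]^2 \ge 0$.

In the limit $t\to 0$ one gets the desired inequality. □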



The proof of Lemma 7.2 relies on elementary measure-theoretic arguments and we omit it here. It uses the fact that the set of symmetric nonnegative matrices with a multiple eigenvalue has smaller dimension than the ambient space of all symmetric nonnegative matrices.

Lemma 7.2. Assume that $\Phi$ is convex and twice continuously differentiable. For every $\varepsilon > 0$ there exists a matrix $Q_\varepsilon \ge 0$ such that $\|Q_\varepsilon\| < \varepsilon$ and $D^2\Phi + Q_\varepsilon$ has no multiple eigenvalues almost everywhere.

Theorem 7.3. Assume that $D^2 W \ge K\cdot \mathrm{Id}$ and $(D^2 V)_+ \in L^1(\mu)$. Then the following inequality holds:

\[
\int \|(D^2 V)_+\| \, d\mu \ge K \int \|D^2\Phi\|^2 \, d\mu .
\]



Proof. Step 1. Let $\Phi$ be smooth. Fix a point $x_0$. Assume that $D^2\Phi(x_0)$ has no multiple eigenvalues. Assume that $v$ is a smooth field of unit eigenvectors of $D^2\Phi$ corresponding to the largest eigenvalue $\lambda$, which is simple, in a neighborhood $U_{x_0}$ of $x_0$. Let us show that $\operatorname{Tr}\bigl(\partial_v D^2\Phi \cdot Dv \cdot (D^2\Phi)^{-1}\bigr) \ge 0$ in $U_{x_0}$. Indeed, one has

\[
D^2\Phi\cdot v = \lambda\cdot v, \qquad |v| = 1.
\]

Differentiating both identities we get

\[
(Dv)^T v = 0,
\]

\[
\partial_v D^2\Phi + D^2\Phi\cdot Dv = \lambda\cdot Dv + v\otimes\nabla\lambda .
\]
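Multiply (from the left) the second identity by $(D^2\Phi)^{-1}\cdot (Dv)^T$ and take the trace. Taking into account that

\[
\operatorname{Tr}\bigl[(D^2\Phi)^{-1}(Dv)^T\cdot v\otimes\nabla\lambda\bigr] = \bigl\langle (D^2\Phi)^{-1}(Dv)^T v,\ \nabla\lambda \bigr\rangle = 0,
\]

one obtains

\[
\operatorname{Tr}\bigl[(D^2\Phi)^{-1}(Dv)^T\cdot \partial_v D^2\Phi\bigr] + \operatorname{Tr}\bigl[(D^2\Phi)^{-1}(Dv)^T D^2\Phi\cdot Dv\bigr] = \lambda\cdot \operatorname{Tr}\bigl[(D^2\Phi)^{-1}(Dv)^T\cdot Dv\bigr].
\]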
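Finally we get

\[
\begin{aligned}
\operatorname{Tr}\bigl(\partial_v D^2\Phi\cdot Dv\cdot (D^2\Phi)^{-1}\bigr)
&= \operatorname{Tr}\bigl[(D^2\Phi)^{-1}(Dv)^T\cdot \partial_v D^2\Phi\bigr] \\
&= \operatorname{Tr}\bigl[(D^2\Phi)^{-1}(Dv)^T(\lambda I - D^2\Phi)\cdot Dv\bigr] \\
&= \operatorname{Tr}\bigl[(D^2\Phi)^{-1/2}(Dv)^T(\lambda I - D^2\Phi)\cdot Dv\cdot (D^2\Phi)^{-1/2}\bigr].
\end{aligned}
\]

Note that the latter is equal to

\[
\operatorname{Tr}(ABA^T), \quad\text{where } A = (D^2\Phi)^{-1/2}(Dv)^T(D^2\Phi)^{-1/2}, \quad B = \lambda D^2\Phi - (D^2\Phi)^2 .
\]

Since $\lambda$ is the largest eigenvalue, $B$ is symmetric and non-negative. This immediately implies that

\[
\operatorname{Tr}\bigl(\partial_v D^2\Phi\cdot Dv\cdot (D^2\Phi)^{-1}\bigr) \ge 0.
\]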
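The two facts used in the last step, that $B \ge 0$ and that $\operatorname{Tr}(ABA^T) \ge 0$, can be spelled out as follows. This is a short sketch; the eigenvalues $\lambda_1, \dots, \lambda_d$ of $D^2\Phi$ are notation introduced here.

```latex
% Assumption for this sketch: 0 < \lambda_i \le \lambda are the eigenvalues of D^2\Phi at the given point.
% B = \lambda D^2\Phi - (D^2\Phi)^2 is a polynomial in D^2\Phi, so it is symmetric,
% has the same eigenvectors as D^2\Phi, and its eigenvalues are
\[
\lambda\lambda_i - \lambda_i^2 = \lambda_i(\lambda - \lambda_i) \ge 0, \qquad i = 1, \dots, d .
\]
% Hence B has a symmetric nonnegative square root B^{1/2}, and
\[
\operatorname{Tr}(ABA^T) = \operatorname{Tr}\bigl((AB^{1/2})(AB^{1/2})^T\bigr) = \|AB^{1/2}\|_{HS}^2 \ge 0 .
\]
```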

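In particular, if $\operatorname{supp}(\eta) \subset U_{x_0}$, we obtain from the previous lemma

\[
\int \|(D^2 V)_+\|\, \eta \, d\mu \ge K \int \lambda^2\cdot \eta \, d\mu + \int \bigl\langle (D^2\Phi)_v\cdot v,\ (D^2\Phi)^{-1}\nabla\eta \bigr\rangle \, d\mu + \int \operatorname{Tr}\bigl[(D^2\Phi)^{-1}(D^2\Phi)_v\bigr]^2\, \eta \, d\mu . \tag{24}
\]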

Step 2. Let us assume that $\Phi$ is a convex polynomial such that $D^2\Phi$ has no multiple eigenvalues almost everywhere. Recall that the set $S := S(D^2\Phi)$, where $D^2\Phi$ has multiple eigenvalues, is the zero set of the discriminant of $D^2\Phi$. Hence $S$ is an algebraic variety. In particular, for $\mathcal{H}^{d-1}$-almost every point $x \in \partial S$ the set $S \cap B_r(x)$ is diffeomorphic to $\mathbb{R}^{d-1}$ for sufficiently small $r$ (see [3], Proposition 3.3.14). Let $\mathbb{R}^d \setminus S = \bigcup_i D_i$, where every $D_i$ is a connected component of $\mathbb{R}^d \setminus S$. Clearly, one can choose a vector field of unit eigenvectors $v$ corresponding to the largest eigenvalue of $D^2\Phi$ such that $v|_{D_i}$ is smooth for every $D_i$.

By a classical result of Ky Fan [12], the function $A \to \Lambda(A)$, where $\Lambda(A)$ is the largest eigenvalue, is convex on the set of symmetric matrices. This implies, in particular, that the function

\[
\lambda(x)\colon\ x \to \Lambda(D^2\Phi(x))
\]

has a directional derivative

\[
\partial_e \lambda(x) = \lim_{t\to 0+} \frac{\lambda(x+te) - \lambda(x)}{t}
\]

for every $x$ and every direction $e$. For every regular point $x \in \partial D_i$ we define

\[
\nabla_{D_i}\lambda = \sum_{k=1}^{d} \partial_{e_k}\lambda\cdot e_k ,
\]

where the basis $\{e_k\}$ is chosen in such a way that $x + te_k \in D_i$ for small values of $t > 0$ and every $k$.
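The convexity of $\Lambda$ quoted from [12] can also be seen directly from the variational characterization of the largest eigenvalue. The sketch below uses auxiliary symmetric matrices $A_0$, $A_1$ and a parameter $s$, introduced here for illustration; it is not claimed to be the argument of [12].

```latex
% Variational characterization of the largest eigenvalue of a symmetric matrix A:
\[
\Lambda(A) = \max_{|u| = 1} \langle Au, u\rangle .
\]
% For symmetric A_0, A_1 and s \in [0,1], the quadratic form is linear in A, so
\begin{align*}
\Lambda\bigl(sA_0 + (1-s)A_1\bigr)
  &= \max_{|u|=1}\bigl( s\langle A_0u,u\rangle + (1-s)\langle A_1u,u\rangle \bigr) \\
  &\le s\max_{|u|=1}\langle A_0u,u\rangle + (1-s)\max_{|u|=1}\langle A_1u,u\rangle \\
  &= s\,\Lambda(A_0) + (1-s)\,\Lambda(A_1).
\end{align*}
% A maximum of linear functions is convex, which is the property used in the text.
```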
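Note that

\[
\partial_e \langle D^2\Phi\cdot v, v\rangle = \langle (D^2\Phi)_e\cdot v, v\rangle + 2\langle D^2\Phi\, \partial_e v, v\rangle
\]

inside of $D_i$. Since $v$ is a unit eigenvector of $D^2\Phi$, $\partial_e v$ is orthogonal to $D^2\Phi\, v$ and one has

\[
\partial_e \lambda(x) = \langle (D^2\Phi)_e\cdot v, v\rangle .
\]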
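The orthogonality claim can be verified in two lines. The following sketch only uses $|v| = 1$, the symmetry of $D^2\Phi$, and the eigenvector relation $D^2\Phi\, v = \lambda v$ inside $D_i$.

```latex
% Differentiating |v|^2 = 1 in the direction e:
\[
0 = \partial_e \langle v, v\rangle = 2\langle \partial_e v, v\rangle .
\]
% Using the symmetry of D^2\Phi and D^2\Phi v = \lambda v:
\[
\langle D^2\Phi\,\partial_e v, v\rangle = \langle \partial_e v, D^2\Phi\, v\rangle
  = \lambda \langle \partial_e v, v\rangle = 0 .
\]
% Since \lambda = \langle D^2\Phi v, v\rangle, the product rule gives
\[
\partial_e\lambda = \langle (D^2\Phi)_e v, v\rangle + 2\langle D^2\Phi\,\partial_e v, v\rangle
  = \langle (D^2\Phi)_e v, v\rangle .
\]
```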

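Let us fix a compact domain $B$ with smooth boundary and apply (24) to $\eta = I_{D_i\cap B}$. More precisely, we choose a sequence of smooth test functions $\{\eta_n\}$ with supports inside of $D_i\cap B$ such that $\eta_n \to I_{D_i\cap B}$. One gets in the limit

\[
\begin{aligned}
\int_{D_i\cap B} \|(D^2 V)_+\| \, d\mu \ge{}& K \int_{D_i\cap B} \lambda^2 \, d\mu + \int_{D_i\cap B} \operatorname{Tr}\bigl[(D^2\Phi)^{-1}(D^2\Phi)_v\bigr]^2 \, d\mu \\
&+ \int_{\partial D_i\cap B} \bigl\langle \nabla_{D_i}\lambda,\ (D^2\Phi)^{-1} n_{D_i} \bigr\rangle \, d\mu + \int_{D_i\cap \partial B} \bigl\langle \nabla_{D_i}\lambda,\ (D^2\Phi)^{-1} n_B \bigr\rangle \, d\mu ,
\end{aligned}
\tag{25}
\]

where $n_B$ is the inward normal to $\partial B$.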

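Now take a regular point $x \in \partial D_i$. Clearly, $x$ belongs to the border between two sets $D_i$ and $D_j$, $j \ne i$, and the inward normal of $\partial D_i$ can be computed in the following way:

\[
n_{ij} = \frac{\nabla_{D_i}\lambda - \nabla_{D_j}\lambda}{\|\nabla_{D_i}\lambda - \nabla_{D_j}\lambda\|} .
\]

Taking the sum of (25) over $i$ we get that the integral term over the boundary $\bigcup_i \partial D_i \cap B$ takes the form

\[
\sum_{i<j} \int_{\partial D_i\cap \partial D_j\cap B} \Bigl\langle \nabla_{D_i}\lambda - \nabla_{D_j}\lambda,\ (D^2\Phi)^{-1}\, \frac{\nabla_{D_i}\lambda - \nabla_{D_j}\lambda}{\|\nabla_{D_i}\lambda - \nabla_{D_j}\lambda\|} \Bigr\rangle \, d\mu
\]

and it is obviously non-negative.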
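The sign of the interface term comes from the positive definiteness of $(D^2\Phi)^{-1}$. A minimal sketch, where $w$ is shorthand introduced here for the jump $\nabla_{D_i}\lambda - \nabla_{D_j}\lambda$ at a regular boundary point:

```latex
% Assumption: D^2\Phi is positive definite, hence so is (D^2\Phi)^{-1}; w \ne 0.
\[
\Bigl\langle w,\ (D^2\Phi)^{-1}\frac{w}{\|w\|} \Bigr\rangle
  = \frac{1}{\|w\|}\,\bigl\langle (D^2\Phi)^{-1} w,\ w \bigr\rangle \;\ge\; 0 .
\]
% The contributions of D_i and D_j on their common boundary carry opposite inward
% normals n_{ij} = -n_{ji}, which is why the two gradients combine into the jump w.
```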
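Summing over $i$ and discarding this non-negative term, we get

\[
\int_B \|(D^2 V)_+\| \, d\mu \ge K \int_B \lambda^2 \, d\mu + \int_B \operatorname{Tr}\bigl[(D^2\Phi)^{-1}(D^2\Phi)_v\bigr]^2 \, d\mu + \sum_i \int_{D_i\cap \partial B} \bigl\langle \nabla_{D_i}\lambda,\ (D^2\Phi)^{-1} n_B \bigr\rangle \, d\mu .
\]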



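Fix a smooth compactly supported nonnegative test function $\xi$. Applying the coarea formula and the above estimate to the level sets of $\xi$, one can easily get that

\[
\int \|(D^2 V)_+\|\, \xi \, d\mu \ge K \int \lambda^2\, \xi \, d\mu + \int \operatorname{Tr}\bigl[(D^2\Phi)^{-1}(D^2\Phi)_v\bigr]^2\, \xi \, d\mu + \int \bigl\langle \nabla\lambda,\ (D^2\Phi)^{-1}\nabla\xi \bigr\rangle \, d\mu ,
\]

where $\nabla\lambda = (D^2\Phi)_v\cdot v$ is defined almost everywhere.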
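Applying the standard relations between the operator and Hilbert-Schmidt norms

\[
\operatorname{Tr}\bigl[(D^2\Phi)^{-1}(D^2\Phi)_v\bigr]^2 = \bigl\|(D^2\Phi)^{-1/2}(D^2\Phi)_v(D^2\Phi)^{-1/2}\bigr\|_{HS}^2 \ge \frac{\|(D^2\Phi)_v\|_{HS}^2}{\|D^2\Phi\|^2} \ge \frac{\|(D^2\Phi)_v\|^2}{\|D^2\Phi\|^2}
\]

and the Cauchy inequality one finally gets

\[
\int \|(D^2 V)_+\|\, \xi \, d\mu + 4\int \frac{|\nabla\xi|^2}{\xi}\, \|D^2\Phi\|^2\, \|(D^2\Phi)^{-1}\|^2 \, d\mu \ge K \int \lambda^2\, \xi \, d\mu . \tag{26}
\]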
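The first inequality in the chain of norm relations follows from the submultiplicativity of the Hilbert-Schmidt norm with respect to the operator norm. In the sketch below, $H$, $M$ and $N$ are shorthand introduced here and are not part of the original notation.

```latex
% Shorthand: H = D^2\Phi (symmetric positive definite), M = (D^2\Phi)_v (symmetric),
%            N = H^{-1/2} M H^{-1/2} (symmetric).
\begin{align*}
\operatorname{Tr}\bigl[(H^{-1}M)^2\bigr]
  &= \operatorname{Tr}\bigl(H^{-1/2}MH^{-1/2}\cdot H^{-1/2}MH^{-1/2}\bigr)
   = \operatorname{Tr}(N^2) = \|N\|_{HS}^2 , \\
\|M\|_{HS} &= \|H^{1/2} N H^{1/2}\|_{HS}
   \le \|H^{1/2}\|\,\|N\|_{HS}\,\|H^{1/2}\| = \|H\|\,\|N\|_{HS} .
\end{align*}
% Therefore
\[
\|N\|_{HS}^2 \ge \frac{\|M\|_{HS}^2}{\|H\|^2} \ge \frac{\|M\|^2}{\|H\|^2},
\]
% where the last step uses that the operator norm never exceeds the Hilbert-Schmidt norm.
```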



Choosing an appropriate sequence of compactly supported functions $\{\xi_n\}$ such that $\lim_n \xi_n = 1$ and the second integral on the left-hand side of (26), taken with $\xi = \xi_n$, tends to zero, we get the claim.

Step 3. Here we prove the general case. In the same way as in Theorem 3.1 one can approximate $V$ and $W$ by smooth functions with at most quadratic growth. Hence, one can assume without loss of generality that $\Phi$ is smooth. To apply the previous step we fix a compact set $B$ and choose a sequence of polynomial functions $\{\Phi_n\}$ such that $\Phi_n \to \Phi$ uniformly on $B$ together with all the derivatives up to the fourth order (this can be done by a multidimensional version of the Weierstrass approximation theorem).

Since we have convergence of the second derivatives, the functions $\Phi_n$ are convex for sufficiently big $n$. Applying Lemma 7.2 we may assume that $S(\Phi_n)$ has zero measure. Note that the mapping $\nabla\Phi_n$ sends $e^{-V_n}\, dx$ onto $\nu$, where

\[
V_n = W(\nabla\Phi_n) - \log\det D^2\Phi_n .
\]

From the convergence $\Phi_n \to \Phi$ it follows that $V_n \to V$ uniformly in $B$ and the same holds for the derivatives up to the second order. Passing to the limit one obtains (26) for $V$ and any smooth test function $\xi$. Choosing an appropriate sequence $\{\xi_n\}$ with $\xi_n \to 1$ one can easily complete the proof. □

The following result generalizes Theorem 7.3 in the same manner as Theorem 6.1 generalizes Theorem 3.1. The proof can be obtained by modifying the proof of Theorem 7.3 and we omit it here.

Theorem 7.4. Assume that $D^2 W \ge K\cdot \mathrm{Id}$. Then for every $r \ge 1$ one has

\[
K \Bigl(\int \|D^2\Phi\|^{2r} \, d\mu\Bigr)^{1/r} \le \Bigl(\int \|(D^2 V)_+\|^{r} \, d\mu\Bigr)^{1/r} .
\]